Enhancing P2P File-Sharing with an Internet-Scale Query Processor

Authors

  • Boon Thau Loo
  • Joseph M. Hellerstein
  • Ryan Huebsch
  • Scott Shenker
  • Ion Stoica
Abstract

In this paper, we address the problem of designing a scalable, accurate query processor for peer-to-peer filesharing and similar distributed keyword search systems. Using a globally-distributed monitoring infrastructure, we perform an extensive study of the Gnutella filesharing network, characterizing its topology, data and query workloads. We observe that Gnutella’s query processing approach performs well for popular content, but quite poorly for rare items with few replicas. We then consider an alternate approach based on Distributed Hash Tables (DHTs). We describe our implementation of PIERSearch, a DHT-based system, and propose a hybrid system where Gnutella is used to locate popular items, and PIERSearch for handling rare items. We develop an analytical model of the two approaches, and use it in concert with our Gnutella traces to study the tradeoff between query recall and system overhead of the hybrid system. We evaluate a variety of localized schemes for identifying items that are rare and worth handling via the DHT. Lastly, we show in a live deployment on fifty nodes on two continents that PIERSearch nicely complements Gnutella in its ability to handle rare items.
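The hybrid idea in the abstract (flooding for popular items, a DHT index for rare ones) can be illustrated with a minimal Python sketch. This is not the PIERSearch implementation: the class `ToyDHTIndex`, the functions `flood_search` and `hybrid_search`, and the `rare_threshold` parameter are all hypothetical names, the DHT is replaced by an in-memory dictionary, and the "fall back to the DHT when flooding returns few results" rule is only one assumed way such a hybrid might decide when to use the index.

```python
from collections import defaultdict


class ToyDHTIndex:
    """Hypothetical in-memory stand-in for a DHT keyword index."""

    def __init__(self):
        # keyword -> set of file names containing that keyword
        self._index = defaultdict(set)

    def publish(self, filename):
        # Index the file under each of its keywords.
        for term in filename.lower().split():
            self._index[term].add(filename)

    def search(self, query):
        # Return files that contain every query term.
        terms = query.lower().split()
        if not terms:
            return set()
        postings = [self._index.get(term, set()) for term in terms]
        return set.intersection(*postings)


def flood_search(query, reachable_peers):
    """Stand-in for flooding: scan the files of peers inside the search horizon."""
    terms = query.lower().split()
    hits = set()
    for files in reachable_peers:
        for name in files:
            words = name.lower().split()
            if all(term in words for term in terms):
                hits.add(name)
    return hits


def hybrid_search(query, reachable_peers, dht, rare_threshold=2):
    """Flood first; consult the index only if flooding finds few results (assumed rule)."""
    hits = flood_search(query, reachable_peers)
    if len(hits) < rare_threshold:
        hits |= dht.search(query)
    return hits


if __name__ == "__main__":
    dht = ToyDHTIndex()
    # A rare file held by a peer outside the flooding horizon.
    dht.publish("obscure live bootleg")
    peers = [["popular hit song"], ["popular hit song", "another tune"]]
    print(hybrid_search("popular hit", peers, dht))
    print(hybrid_search("obscure bootleg", peers, dht))
```

In this toy setup, the popular file is found by flooding alone, while the rare file is reachable only through the index, which mirrors the division of labor the abstract describes.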

Similar resources

P2P Network Trust Management Survey

Peer-to-peer (P2P) applications are no longer limited to home users and are starting to be accepted in academic and corporate environments. While file sharing and instant messaging applications are the most traditional examples, they are no longer the only ones benefiting from the potential advantages of P2P networks. For example, network file storage, data transmission, distributed computing, and co...

Distributed caching in unstructured peer-to-peer file sharing networks

Nowadays, the peer-to-peer (P2P) system is one of the largest Internet bandwidth consumers. To relieve the burden on Internet backbone and improve the query and retrieve performance of P2P file sharing networks, efficient P2P caching algorithms are of great importance. In this paper, we propose a distributed topology-aware unstructured P2P file caching infrastructure and design novel placement ...

Query Routing and Processing in Peer-To-Peer Data Sharing Systems

Sharing music files via the Internet was the essential motivation of early P2P systems. Despite the great success of P2P file sharing systems, these systems support only "simple" queries. The focus in such systems is how to carry out efficient query routing in order to find the nodes storing a desired file. Recently, several research efforts have been made to extend P2P systems to be ...

A Resource Exchange Architecture for Peer-to-Peer File Sharing Applications

A peer-to-peer (P2P) file sharing network provides a resource sharing platform for Internet users. To achieve a higher degree of resource sharing, heterogeneous P2P file sharing networks need a way to collaborate and communicate with each other. Based on the approach of interconnecting heterogeneous P2P file sharing networks, users on one P2P file sharing network can share resources and search data...

Searching in Variably Connected P2P Networks

Peer-to-peer networks are gaining popularity through file-sharing communities. Most P2P networks demand a certain stability from their nodes in order to function satisfactorily. A variably connected P2P network, however, is a network where the connectivity of nodes might vary greatly over time. The nodes can be in different connection states, such as connected to the Internet or moving between net...

Publication date: 2004